Fast semi-supervised evidential clustering
Abstract
Semi-supervised clustering is a constrained clustering technique that organizes a collection of unlabeled data into homogeneous subgroups with the help of domain knowledge expressed as constraints. These methods are, most of the time, variants of the popular k-means algorithm. As such, they are based on a criterion to minimize. Amongst existing semi-supervised clustering methods, Semi-supervised Evidential Clustering (SECM) deals with the problem of uncertain/imprecise labels and creates a credal partition. In this work, a new heuristic algorithm, called SECM-h, is presented. The proposed algorithm relaxes the constraints of SECM in such a way that the optimization problem can be solved using the Lagrangian method. Experimental results show that the proposed algorithm largely improves execution time while accuracy is maintained.
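The abstract does not spell out what a credal partition is. In the evidential clustering literature, the term usually means that each object receives a mass function over subsets of the cluster set rather than a single label, with mass on the empty set often read as evidence that the object is an outlier. The sketch below illustrates that data structure only. It is not SECM or SECM-h, and every name and number in it (`is_mass_function`, `plausibility`, `pignistic`, the object `x1`, the clusters `w1` and `w2`) is a hypothetical choice made for illustration.

```python
# Illustrative sketch only: names and numbers are hypothetical, not taken from the paper.

def is_mass_function(mass, tol=1e-9):
    """Check that all masses are non-negative and that they sum to one."""
    return all(m >= 0 for m in mass.values()) and abs(sum(mass.values()) - 1.0) <= tol


def plausibility(mass, cluster):
    """Total mass of every subset that contains the given cluster."""
    return sum(m for subset, m in mass.items() if cluster in subset)


def pignistic(mass, clusters):
    """Split each subset's mass equally among its members,
    renormalizing by the mass that is not on the empty set."""
    empty_mass = mass.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("all mass is on the empty set")
    return {
        c: sum(m / len(s) for s, m in mass.items() if c in s) / (1.0 - empty_mass)
        for c in clusters
    }


clusters = ("w1", "w2")

# Mass function of one object: partly an outlier (empty set), mostly in w1,
# and partly undecided between w1 and w2.
mass = {
    frozenset(): 0.25,
    frozenset({"w1"}): 0.5,
    frozenset({"w1", "w2"}): 0.25,
}

# A credal partition maps every object to such a mass function.
credal_partition = {"x1": mass}

if __name__ == "__main__":
    for obj, m in credal_partition.items():
        print(obj, {c: plausibility(m, c) for c in clusters}, pignistic(m, clusters))
```

A hard or fuzzy partition can be recovered from this representation, for example by assigning each object to the cluster with the highest pignistic probability.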
Similar resources
Fast Randomized Semi-Supervised Clustering
We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be ...
Semi-supervised Clustering
Clustering is an unsupervised learning problem whose objective is to find a partition of the given data. However, a major challenge in clustering is to define an appropriate objective function in order to find an optimal partition that is useful to the user. To facilitate data clustering, it has been suggested that the user provide some supplementary information about the data (e.g. pairwise ...
Semi-Supervised Projected Clustering
Recent studies suggest that projected clusters with extremely low dimensionality exist in many real datasets. A number of projected clustering algorithms have been proposed in the past several years, but few can identify clusters with dimensionality lower than 10% of the total number of dimensions, which are commonly found in some real datasets such as gene expression profiles. In this paper we...
Semi-supervised clustering methods
Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable nor is anything known about the relationship between the observations in the data set. In many situations, however, informat...
Semi supervised clustering for Text Clustering
Based on the Affinity Propagation (AP) clustering algorithm, I present in this paper a semi-supervised text clustering algorithm, called Seeds Affinity Propagation (SAP). There are two main contributions in my approach: 1) a similarity metric that captures the structural information of texts, and 2) a seed construction method to improve the semi-supervised clustering process. To study the perform...
Journal
Journal title: International Journal of Approximate Reasoning
Year: 2021
ISSN: 1873-4731, 0888-613X
DOI: https://doi.org/10.1016/j.ijar.2021.03.008